Abstract

Using molecular modeling techniques, we constructed a rational 3D model of ORF3 of the SARS coronavirus (SARS-CoV). The predicted
structure and a comparison with its structural neighbors suggest that ORF3 may be involved in FAD/NAD binding. Furthermore, we
identified three pairs of non-canonical N-H···π interactions in the structure of ORF3, which may contribute to the stability of the
protein structure. These results provide important clues for a better understanding of SARS-CoV ORF3 and for exploring new
therapeutic strategies.
© 2004 Elsevier B.V. All rights reserved.
1. Introduction


Sequence analysis of the SARS coronavirus genome reveals
that it contains five major open reading frames (ORFs) that
encode a polymerase and the S, M, E and N proteins, like those
of other coronaviruses. However, it also contains nine potential
ORFs that are not found in other coronaviruses [1]. In theory,
all of these proteins can be used as targets in drug and vaccine
design. However, the functions of these unknown proteins
are difficult to infer because they share very poor
sequence homology with proteins available in the Protein
Data Bank.

The 3D Jury system [2] uses a global network of
independent structure prediction servers to detect patterns
of structural similarity between diverse models and to select
the correct fold from a set of borderline predictions. An
exciting finding based on this method [3] is the assignment
of an mRNA cap-1 methyltransferase function to the nsp13
protein of the SARS coronavirus (3D Jury score > 100). In
this study, we used the 3D Jury meta-server for a fold
recognition study of ORF3 and constructed a rational
molecular model in order to understand the potential
function of ORF3 in terms of its tertiary structure.
2. Materials and methods


The sequence of the ORF3 protein of SARS-CoV was
downloaded from GenBank (NP_828851) and submitted to the
3D Jury system [2] for fold prediction. 3D Jury is a protein
structure prediction meta-server that combines more than 10
novel fold recognition methods, which made a dramatic
impact on the Critical Assessment of Protein Structure
Prediction (CASP-5) in 2002. Proteins with a sufficiently
high 3D score were used as templates to construct 3D models
of ORF3 with the MODELLER program [4]. The quality of
the 3D models was evaluated with the ProQ program [5], and
the best model was used for further analyses. To obtain
information about the possible function of ORF3, the VAST
(http://www.ncbi.nlm.nih.gov/Structure/VAST/vastsearch.html),
DALI (http://www.ebi.ac.uk/dali/) and CE [6] programs were
used to search for structural neighbors of the ORF3 protein.
Structural comparisons were performed with LGA [7]. Finally,
the NCI program [8] was used to identify non-canonical
interactions in the protein structure. The 3D structures
were visualized with PROTEINEXPLORER
(http://www.proteinexplorer.org).


3. Results and discussion 


Table 1

The sequence alignment between ORF3 and 1LVL and the secondary structure

EEEEEEEEEEE HHHHHHHH HHHHHHHH EE
ORF3  MRFFTLRSITAQPVKIDNASPASTVHATATIPLQASLPFGWLVIGV--AFLAVFQSATKI
1LVL  ---------------------QOTIQTTLLI-------IGGGPGGYVAAIRAG-QLGIPT
EEE EEHHHHHHHHHHHHHHHHHHHHHHHHHHHH HHHHHH
ORF3  IALNKRWQLALYKGFQFICNLLLLFVTIYSHLLLVA--------------AGMEAQFLYL
1LVL  VLV--EGQALGGTCLNIGCIPSKALITHVAEBOQFHOASRFTEPSPLGISVASPRLDIGQSVA
HHHHHHHHHHHHHHHHHHHHHEEER EEEE EEEEE E EEE
ORF3  YALIYFLQCINACRIIMRCWLCWKCKSKNPLLYDANYFVCW-HTHNYDYCIPYNS--VTD
1LVL  WKDGIVDRLTTGVAALLKKHGVKVVHGWAKVL-DGKOQVEVDGORIQCEHLLLATGSSSVE
EEEEE HHH EEEEEEEEEEEEEEEER EEE
ORF3  TIVVTEGDGISTPKLKEDYQIGGYSEDRHSGVKDYVVVHGYFTEVYYQLESTQITTDTGI
1LVL  LPRRPRTKGFNLECLDLKMNGAAIAIDERCOTSMHNVWAIGD--VAGEPMLAHRAMAQG--
HHHHHHHHHH
ORF3  ENATFFIFNKLVK-DPPNVQIHTIDGSSGVANPAMDPIYDEPTTTTSVPL
1LVL  EMVARTIAGKARRE------------------------------------

The 3D Jury system found three significant hits (3D
score > 90) with a fold similar to that of ORF3 (threading
server PCONS2): 1LVL (Pseudomonas putida lipoamide
dehydrogenase, 3D score 102), 3GRS (human glutathione
reductase, 3D score 99) and 1GES_A (Escherichia coli
glutathione reductase, 3D score 95.5). Using 1LVL, 3GRS
and 1GES_A as templates, the corresponding 3D models of
ORF3 were generated, and the quality of each model was
evaluated by the ProQ program with two measures.
The results are: 1LVL (ProQ-LG = 2.826,
ProQ-MX = 0.249), 3GRS (ProQ-LG = 1.498, ProQ-MX =
0.146) and 1GES (ProQ-LG = 1.601, ProQ-MX = 0.14). The
cutoff values of the two quality measures are ProQ-LG >
1.5 or ProQ-MX > 0.1 for a ‘correct model’, and ProQ-LG > 3
or ProQ-MX > 0.5 for a ‘good model’. All three models of
ORF3 therefore qualify as ‘correct models’, and the model
based on 1LVL is close to a ‘good model’. We therefore took
the model built on template 1LVL as the 3D model of ORF3.
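
The ProQ cutoffs quoted above can be sketched as a small classifier. This is a minimal illustration in Python, not part of the original workflow: the function names and the ranking rule (sort by ProQ-LG, then ProQ-MX) are assumptions made for the example, while the scores and cutoffs are those given in the text.

```python
# ProQ scores reported in the text for the three template-based models.
models = {
    "1LVL": {"lg": 2.826, "mx": 0.249},
    "3GRS": {"lg": 1.498, "mx": 0.146},
    "1GES": {"lg": 1.601, "mx": 0.14},
}


def proq_class(lg, mx):
    """Classify a model using the cutoffs quoted in the text."""
    if lg > 3.0 or mx > 0.5:
        return "good"
    if lg > 1.5 or mx > 0.1:
        return "correct"
    return "below cutoff"


def rank_models(scores):
    """Order template names best first.

    Assumption for this sketch: rank by ProQ-LG, then by ProQ-MX.
    """
    return sorted(
        scores,
        key=lambda name: (scores[name]["lg"], scores[name]["mx"]),
        reverse=True,
    )


classes = {name: proq_class(s["lg"], s["mx"]) for name, s in models.items()}
best = rank_models(models)[0]
```

With the reported scores, all three models fall in the ‘correct’ class (3GRS only through its ProQ-MX value) and the 1LVL-based model ranks first, in line with the choice made above.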

Fig. 1. The predicted 3D model of SARS-CoV ORF3 based on template
1LVL (Pseudomonas putida lipoamide dehydrogenase), which consists of
5 α-helices and 6 β-sheets.

The alignment between ORF3 and 1LVL and the secondary
structure are shown in Table 1. Fig. 1 shows the 3D model of
ORF3 built on template 1LVL, which consists of 5 α-helices
and 6 β-sheets. The three trans-membrane regions (34-56,
77-99 and 103-125) predicted by Marra et al. [1]
correspond to three helical regions in the alignment:
43-58, 71-98 and 103-129. Two additional helical
regions are also present in the structure: 3-12 and 138-147.
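
The correspondence between the predicted trans-membrane regions and the modeled helices can be checked by counting shared residues. The following Python sketch uses only the residue ranges quoted above; the helper names and the "largest overlap" matching rule are assumptions for illustration, not a method described in the paper.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive residue ranges."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Trans-membrane regions predicted by Marra et al. [1].
tm_regions = [(34, 56), (77, 99), (103, 125)]
# Helical regions of the ORF3 model, as listed in the text.
helices = [(3, 12), (43, 58), (71, 98), (103, 129), (138, 147)]


def best_match(region, candidates):
    """Pick the helix sharing the most residues with the region."""
    return max(candidates, key=lambda helix: overlap(region, helix))


matches = {tm: best_match(tm, helices) for tm in tm_regions}
shared = {tm: overlap(tm, helix) for tm, helix in matches.items()}
```

Under these assumptions each trans-membrane region pairs with exactly one of the three helices named in the text, and the two remaining helices (3-12 and 138-147) share no residues with any predicted trans-membrane region.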

Naturally, a question arises: what can the 3D structure
tell us about the function of ORF3? The three templates
above (1LVL, 3GRS and 1GES), which belong to the
FAD/NAD-linked reductase family, lead us to speculate
that ORF3 may be a protein related to FAD/NAD binding.
This seems consistent with the speculation that ORF3 may
encode a protein related to ATP-binding [1], because
FAD/NAD is generally used as an oxidant to yield ATP [9].

Fig. 2. The superposition of SARS-CoV ORF3 (white) with other
dehydrogenases: 1OJT (lipoamide dehydrogenase from the bacterium
Neisseria meningitidis, blue), 1JEH (lipoamide dehydrogenase from the
yeast Saccharomyces cerevisiae, gray), 1LPF (lipoamide dehydrogenase
from the bacterium Pseudomonas fluorescens, green), 1EBD
(dihydrolipoamide dehydrogenase from the bacterium Bacillus
stearothermophilus, red), and 1DXL (lipoamide dehydrogenase from the
pea Pisum sativum, yellow).

To collect as much evidence as possible for this
speculation, we employed VAST, DALI and CE to search
for structural neighbors of ORF3. Apart from the above
templates, the top hits are the following dehydrogenases,
which also belong to the FAD/NAD-linked reductase
family: 1OJT (lipoamide dehydrogenase from the
bacterium Neisseria meningitidis), 1JEH (lipoamide
dehydrogenase from the yeast Saccharomyces cerevisiae),
1LPF (lipoamide dehydrogenase from the bacterium
Pseudomonas fluorescens), 1EBD (dihydrolipoamide
dehydrogenase from the bacterium Bacillus
stearothermophilus), and 1DXL (lipoamide dehydrogenase
from the pea Pisum sativum). The superpositions of these
enzymes with ORF3 are shown in Fig. 2. The revised
structure alignment (Fig. 3) shows a number of similar
structural patterns with conserved residues (shown in
bold). In particular, these include a partially conserved
FAD-binding motif (marked as ‘#’) and NAD-binding motif
(marked as ‘$’) [10-14]. This suggests that there could
indeed be a link between ORF3 and FAD/NAD-binding
proteins.

Fig. 3. The structure alignment between SARS-CoV ORF3 and other dehydrogenases: 1OJT (lipoamide dehydrogenase from the bacterium Neisseria
meningitidis), 1JEH (lipoamide dehydrogenase from the yeast Saccharomyces cerevisiae), 1LPF (lipoamide dehydrogenase from the bacterium Pseudomonas
fluorescens), 1EBD (dihydrolipoamide dehydrogenase from the bacterium Bacillus stearothermophilus), and 1DXL (lipoamide dehydrogenase from the pea Pisum
sativum). Conserved residues are shown in bold. The FAD-binding motif is marked as ‘#’ and the NAD-binding motif as ‘$’.

Fig. 4. Non-canonical interactions in the structure of SARS-CoV ORF3.
The residue pairs involved (colored blue) are: Tyr160 (donor) and Phe43
(acceptor), and Phe43 (donor) and Tyr206 (acceptor).

Finally, the non-canonical interactions in the ORF3 protein
structure were identified with the NCI program. The results
show three pairs of main chain-side chain interactions:
Tyr160 (donor) and Phe43 (acceptor), Phe43 (donor) and
Tyr206 (acceptor), and Ile232 (donor) and Phe231
(acceptor). Among these interactions, Phe43 forms two
N-H···π bonds in a sandwich fashion: it donates one to
Tyr206 and accepts one from Tyr160, as also observed in
human Rac1 [15] and the SARS-CoV main protease [16].
These non-canonical interactions anchor the long helix
containing Phe43 to the two loops containing Tyr160 and
Tyr206, and thereby stabilize the structure of the ORF3
protein (Fig. 4). These results can be used for the rational
design of mutagenesis experiments and for analysis of the
conservation of interactions at functional sites. In recent
years, non-canonical interactions have been shown to be
important for the stability of protein structure [17-19]
and for ligand recognition [20].
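
The ‘sandwich’ arrangement described above can be made explicit by tabulating donor and acceptor roles. The Python sketch below is illustrative only: it takes the three donor/acceptor pairs listed in the text (with the third donor read as Ile232) and treats a residue as sandwiched when it appears both as a donor and as an acceptor. That criterion and the function names are assumptions for this example, not the output format or algorithm of the NCI program.

```python
from collections import defaultdict

# (donor, acceptor) pairs of the N-H...pi interactions listed in the text.
interactions = [
    ("Tyr160", "Phe43"),
    ("Phe43", "Tyr206"),
    ("Ile232", "Phe231"),
]


def roles(pairs):
    """Map each residue to the partners it donates to and accepts from."""
    table = defaultdict(lambda: {"donates_to": [], "accepts_from": []})
    for donor, acceptor in pairs:
        table[donor]["donates_to"].append(acceptor)
        table[acceptor]["accepts_from"].append(donor)
    return dict(table)


def sandwiched(pairs):
    """Residues that act both as donor and as acceptor (assumed criterion)."""
    return sorted(
        residue
        for residue, r in roles(pairs).items()
        if r["donates_to"] and r["accepts_from"]
    )


central = sandwiched(interactions)
```

For the three pairs above, Phe43 is the only residue that both donates and accepts, which matches the description of Phe43 bridging Tyr160 and Tyr206.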